The object of this exercise is to gain familiarity with the Dremio workflow and UI.

The general steps for this exercise are:

1. Create a query in Dremio for data files in a folder.
2. Create a virtual dataset that is a version of the physical dataset with datatypes updated.
3. Bring the virtual dataset into Power BI from Dremio and create a simple visualisation.

---------------------------------------------------------------------------------
Step 1 - Set up the data sources

* Create a folder in your home space to organise the queries for this exercise.
* If you haven't already done so, create a data lake entry for the training AWS account.

---------------------------------------------------------------------------------
Step 2 - Query for Covid Data

In the Dremio Sources pane, navigate to the S3 folder 'data-eng-21'. Navigate to the folder 'owid-covid' and click the "format folder" icon on the right-hand side. Ensure the folder structure is being detected correctly. When you are happy with the structure of the dataset, click "Save As" and save this dataset in your new folder.

---------------------------------------------------------------------------------
Step 5 - Bring the data into Power BI

To bring the data into Power BI you need the Dremio ODBC/Power BI driver installed. You can then either:

a) Select your dataset in Dremio, click the Power BI icon in the top right and open the file that downloads

OR

b) Open Power BI, select "Get Data", search for Dremio, and import the joined dataset

---------------------------------------------------------------------------------
Step 6 - Simple Visualisation (Optional)

Using Power BI, create some simple visualisations to demonstrate the change in Covid cases over time.
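
As a rough illustration of general step 2 (a virtual dataset that is a version of the physical dataset with datatypes updated), the sketch below builds the kind of SQL you might paste into the Dremio SQL editor. It is only a sketch under assumptions: the source name `training-lake`, the home space `@me`, the folder `dremio-exercise`, the dataset name `covid_typed`, and the columns `date` and `new_cases` are hypothetical placeholders, so substitute whatever your own environment and the detected folder structure actually show. It also assumes your Dremio version accepts `CREATE OR REPLACE VDS`; some versions may expect `CREATE VIEW` instead.

```python
def quote_path(parts):
    """Wrap each path component in double quotes and join with dots.

    Assumes the components themselves contain no double quotes.
    """
    return ".".join(f'"{part}"' for part in parts)


def build_cast_view_sql(view_path, source_path, column_types):
    """Build a statement that re-exposes a dataset with explicit CASTs.

    view_path    -- path components of the virtual dataset to create
    source_path  -- path components of the physical dataset
    column_types -- mapping of column name to target SQL type
    """
    select_list = ",\n  ".join(
        f'CAST("{column}" AS {sql_type}) AS "{column}"'
        for column, sql_type in column_types.items()
    )
    return (
        f"CREATE OR REPLACE VDS {quote_path(view_path)} AS\n"
        f"SELECT\n  {select_list}\n"
        f"FROM {quote_path(source_path)}"
    )


if __name__ == "__main__":
    # All names below are hypothetical placeholders, not taken from the exercise.
    print(
        build_cast_view_sql(
            view_path=["@me", "dremio-exercise", "covid_typed"],
            source_path=["training-lake", "data-eng-21", "owid-covid"],
            column_types={"date": "DATE", "new_cases": "DOUBLE"},
        )
    )
```

Generating the statement from a column-to-type mapping keeps the list of conversions in one place, which can be handy if you need to adjust types after inspecting the data. The same result can be reached through the Dremio UI by changing each column's type and saving the result as a new dataset.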